---
title: IceVision Bboxes - CycleGAN Data
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/IceVision-on-espiownage-cyclegan.ipynb"
---
{% raw %}
{% endraw %}

This is a mashup of IceVision's "Custom Parser" example and their "Getting Started (Object Detection)" notebooks, used to analyze the SPNet Real dataset, for which I generated bounding boxes. -- S.H. Hawley, July 1, 2021

{% raw %}
 
{% endraw %}

Installing IceVision and IceData

If you're on Colab, run the following cell; otherwise, check IceVision's installation instructions.

{% raw %}
# If on Colab:
#!wget https://raw.githubusercontent.com/airctic/icevision/master/install_colab.sh
#!chmod +x install_colab.sh && ./install_colab.sh
# (Ignore any error messages and just keep going)
{% endraw %} {% raw %}
import torch, re 
tv, cv = torch.__version__, torch.version.cuda
tv = re.sub(r'\+cu.*','',tv)
TORCH_VERSION = 'torch'+tv[0:-1]+'0'
CUDA_VERSION = 'cu'+cv.replace('.','')

print(f"TORCH_VERSION={TORCH_VERSION}; CUDA_VERSION={CUDA_VERSION}")

!pip install -qq mmcv-full=="1.3.8" -f https://download.openmmlab.com/mmcv/dist/{CUDA_VERSION}/{TORCH_VERSION}/index.html --upgrade
!pip install mmdet -qq
TORCH_VERSION=torch1.8.0; CUDA_VERSION=cu102
{% endraw %}
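For reference, here is what the version-string munging above does, run on example values (illustrative only; your versions will differ):

```python
import re

tv, cv = "1.8.1+cu102", "10.2"            # example torch/CUDA version strings
tv = re.sub(r'\+cu.*', '', tv)            # strip the '+cuXXX' suffix -> '1.8.1'
TORCH_VERSION = 'torch' + tv[:-1] + '0'   # pin the patch version to 0 -> 'torch1.8.0'
CUDA_VERSION = 'cu' + cv.replace('.', '') # '10.2' -> 'cu102'
print(f"TORCH_VERSION={TORCH_VERSION}; CUDA_VERSION={CUDA_VERSION}")
```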

Imports

As always, let's import everything from icevision. We'll also need pandas (you may need to install it with `pip install pandas`).

{% raw %}
from icevision.all import *
import pandas as pd
INFO     - Downloading mmdet configs | icevision.models.mmdet.download_configs:download_mmdet_configs:31
{% endraw %}

Download dataset

The original IceVision tutorial used a small sample of a chess dataset offered by Roboflow; here we download the espiownage-cyclegan dataset instead.

{% raw %}
!rm -rf  /root/.icevision/data/espiownage-cyclegan
rm: cannot remove '/root/.icevision/data/espiownage-cyclegan': Permission denied
{% endraw %} {% raw %}
#data_dir = icedata.load_data(data_url, 'chess_sample') / 'chess_sample-master'

# SPNET Real Dataset link (currently proprietary, thus link may not work)
#data_url = "https://hedges.belmont.edu/~shawley/spnet_sample-master.zip"
#data_dir = icedata.load_data(data_url, 'spnet_sample') / 'spnet_sample-master' 

# espiownage cyclegan dataset:
data_url = 'https://hedges.belmont.edu/~shawley/espiownage-cyclegan.tgz'
data_dir = icedata.load_data(data_url, 'espiownage-cyclegan') / 'espiownage-cyclegan'
{% endraw %}

Understand the data format

In this task we were given a .csv file with annotations; let's take a look at it.

!!! danger "Important"
Replace source with your own path for the dataset directory.

{% raw %}
df = pd.read_csv(data_dir / "bboxes/annotations.csv")
df.head()
filename width height label xmin ymin xmax ymax
0 steelpan_0000000.png 512 384 10 130 114 265 281
1 steelpan_0000000.png 512 384 4 272 37 377 178
2 steelpan_0000000.png 512 384 10 415 292 480 353
3 steelpan_0000000.png 512 384 10 36 21 109 158
4 steelpan_0000002.png 512 384 2 100 161 163 218
{% endraw %}

At first glance, we can make the following assumptions:

  • Multiple rows with the same filename, width, height
  • A label for each row
  • A bbox [xmin, ymin, xmax, ymax] for each row

Once we know what our data provides we can create our custom Parser.
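Before writing the parser, it's worth sanity-checking those assumptions. A minimal sketch using a made-up miniature of the annotations file, with rows copied from the head above:

```python
import pandas as pd

# A hypothetical miniature of bboxes/annotations.csv, mirroring the rows shown above
df = pd.DataFrame({
    "filename": ["steelpan_0000000.png"] * 2 + ["steelpan_0000002.png"],
    "width": [512] * 3, "height": [384] * 3,
    "label": [10, 4, 2],
    "xmin": [130, 272, 100], "ymin": [114, 37, 161],
    "xmax": [265, 377, 163], "ymax": [281, 178, 218],
})

# Each filename should map to exactly one (width, height) pair...
assert (df.groupby("filename")[["width", "height"]].nunique() == 1).all().all()

# ...and every bbox should satisfy xmin < xmax and ymin < ymax
assert ((df.xmax > df.xmin) & (df.ymax > df.ymin)).all()

boxes_per_image = df.groupby("filename").size()
print(boxes_per_image)  # 2 boxes for the first image, 1 for the second
```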

{% raw %}
set(np.array(df['label']).flatten())
{2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22}
{% endraw %} {% raw %}
#df['label'] = ["Object"]*len(df)#  "_"+df['label'].apply(str)   # force label to be string-like
{% endraw %} {% raw %}
df['label'] /= 2
#df.head()
df['label'] = df['label'].apply(int) 
print(set(np.array(df['label']).flatten()))
df['label'] = "_"+df['label'].apply(str)+"_"
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
{% endraw %} {% raw %}
df.head()
filename width height label xmin ymin xmax ymax
0 steelpan_0000000.png 512 384 _5_ 130 114 265 281
1 steelpan_0000000.png 512 384 _2_ 272 37 377 178
2 steelpan_0000000.png 512 384 _5_ 415 292 480 353
3 steelpan_0000000.png 512 384 _5_ 36 21 109 158
4 steelpan_0000002.png 512 384 _1_ 100 161 163 218
{% endraw %} {% raw %}
df['label'] = 'AN'  # antinode
df.head()
filename width height label xmin ymin xmax ymax
0 steelpan_0000000.png 512 384 AN 130 114 265 281
1 steelpan_0000000.png 512 384 AN 272 37 377 178
2 steelpan_0000000.png 512 384 AN 415 292 480 353
3 steelpan_0000000.png 512 384 AN 36 21 109 158
4 steelpan_0000002.png 512 384 AN 100 161 163 218
{% endraw %}
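To recap what the label cells above do, here's a standalone sketch of the same transforms on made-up values (the csv's raw labels are the even integers 2..22):

```python
import pandas as pd

labels = pd.Series([10, 4, 10, 2])           # made-up even-valued raw labels
labels = (labels // 2).astype(int)           # halve -> classes 1..11
str_labels = "_" + labels.astype(str) + "_"  # string-ify so they read as class names
print(sorted(str_labels.unique()))           # ['_1_', '_2_', '_5_']

# Collapsing everything into the single 'AN' (antinode) class instead:
one_class = pd.Series(["AN"] * len(labels))
print(one_class.unique())                    # ['AN']
```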

Create the Parser

The first step is to create a template record for our specific type of dataset, in this case we're doing standard object detection:

{% raw %}
template_record = ObjectDetectionRecord()
{% endraw %}

Now use the generate_template method, which will print out all the necessary methods we have to implement.

{% raw %}
Parser.generate_template(template_record)
class MyParser(Parser):
    def __init__(self, template_record):
        super().__init__(template_record=template_record)
    def __iter__(self) -> Any:
    def __len__(self) -> int:
    def record_id(self, o: Any) -> Hashable:
    def parse_fields(self, o: Any, record: BaseRecord, is_new: bool):
        record.set_img_size(<ImgSize>)
        record.set_filepath(<Union[str, Path]>)
        record.detection.set_class_map(<ClassMap>)
        record.detection.add_labels(<Sequence[Hashable]>)
        record.detection.add_bboxes(<Sequence[BBox]>)
{% endraw %}

We can copy the template and use it as our starting point. Let's go over each of the methods we have to define:

  • __init__: What happens here is completely up to you; normally we pass in some reference to our data, data_dir in our case.

  • __iter__: This tells our parser how to iterate over our data; each item returned here will be passed to parse_fields as o. In our case we call df.itertuples to iterate over all df rows.

  • __len__: How many items we will be iterating over.

  • record_id: Should return a Hashable (int, str, etc.). In our case we want all the dataset items that have the same filename to be unified in the same record.

  • parse_fields: Here is where the attributes of the record are collected; the template suggests which methods we need to call on the record and what parameters each expects. The parameter o it receives is the item returned by __iter__.

!!! danger "Important"
Be sure to pass the correct type on all record methods!
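As a toy illustration of the grouping that record_id enables (plain Python with hypothetical filenames, not IceVision internals): rows that share a filename end up in the same record.

```python
# Rows sharing a filename collapse into one "record" with multiple bboxes
rows = [
    ("img_0.png", (130, 114, 265, 281)),
    ("img_0.png", (272, 37, 377, 178)),
    ("img_2.png", (100, 161, 163, 218)),
]
records = {}
for fname, bbox in rows:
    records.setdefault(fname, []).append(bbox)

print(len(records))               # 2 records
print(len(records["img_0.png"]))  # 2 bboxes in the first record
```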

{% raw %}
# Parser class name kept from the original chess tutorial
class ChessParser(Parser):
    def __init__(self, template_record, data_dir):
        super().__init__(template_record=template_record)
        
        self.data_dir = data_dir
        self.df = pd.read_csv(data_dir / "bboxes/annotations.csv")
        #self.df['label'] /= 2
        #self.df['label'] = self.df['label'].apply(int) 
        #self.df['label'] = "_"+self.df['label'].apply(str)+"_"
        self.df['label'] = 'AN'  # make them all the same object
        self.class_map = ClassMap(list(self.df['label'].unique()))
        
    def __iter__(self) -> Any:
        for o in self.df.itertuples():
            yield o
        
    def __len__(self) -> int:
        return len(self.df)
        
    def record_id(self, o) -> Hashable:
        return o.filename
        
    def parse_fields(self, o, record, is_new):
        if is_new:
            record.set_filepath(self.data_dir / 'images' / o.filename)
            record.set_img_size(ImgSize(width=o.width, height=o.height))
            record.detection.set_class_map(self.class_map)
        
        record.detection.add_bboxes([BBox.from_xyxy(o.xmin, o.ymin, o.xmax, o.ymax)])
        record.detection.add_labels([o.label])
{% endraw %}

Let's randomly split the data and parse it with Parser.parse:

{% raw %}
parser = ChessParser(template_record, data_dir)
{% endraw %} {% raw %}
train_records, valid_records = parser.parse()
INFO     - Autofixing records | icevision.parsers.parser:parse:136
{% endraw %}

Let's take a look at one record:

{% raw %}
show_record(train_records[5], display_label=False, figsize=(14, 10))
{% endraw %} {% raw %}
train_records[0]
BaseRecord

common: 
	- Record ID: 500
	- Image size ImgSize(width=512, height=384)
	- Filepath: /home/shawley/.icevision/data/espiownage-cyclegan/espiownage-cyclegan/images/steelpan_0000582.png
	- Img: None
detection: 
	- Class Map: <ClassMap: {'background': 0, 'AN': 1}>
	- Labels: [1]
	- BBoxes: [<BBox (xmin:107, ymin:41, xmax:274, ymax:190)>]
{% endraw %}

Moving On...

Following the Getting Started "refrigerator" notebook...

{% raw %}
# size is set to 384 because EfficientDet requires its inputs to be divisible by 128
image_size = 384  
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=image_size, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(image_size), tfms.A.Normalize()])

# Datasets
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
{% endraw %}
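As a rough sketch of what resize_and_pad does conceptually (my own illustration, not IceVision's implementation): scale the longer side down to the target size, then pad the shorter side out to a square. Note also that 384 satisfies EfficientDet's divisible-by-128 constraint.

```python
def letterbox_dims(w, h, size):
    """Scale so the longer side equals `size`; the remainder becomes padding."""
    scale = size / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    return (new_w, new_h), (size - new_w, size - new_h)

# A 512x384 steelpan frame at image_size=384:
print(letterbox_dims(512, 384, 384))  # ((384, 288), (0, 96))

assert 384 % 128 == 0  # EfficientDet input-size constraint
```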

This next cell generates an error. Ignore it and move on.

{% raw %}
samples = [train_ds[0] for _ in range(3)]
show_samples(samples, ncols=3)
{% endraw %} {% raw %}
model_type = models.mmdet.retinanet
backbone = model_type.backbones.resnet50_fpn_1x(pretrained=True)
{% endraw %} {% raw %}
selection = 0


extra_args = {}

if selection == 0:
  model_type = models.mmdet.retinanet
  backbone = model_type.backbones.resnet50_fpn_1x

elif selection == 1:
  # The Retinanet model is also implemented in the torchvision library
  model_type = models.torchvision.retinanet
  backbone = model_type.backbones.resnet50_fpn

elif selection == 2:
  model_type = models.ross.efficientdet
  backbone = model_type.backbones.tf_lite0
  # The efficientdet model requires an img_size parameter
  extra_args['img_size'] = image_size

elif selection == 3:
  model_type = models.ultralytics.yolov5
  backbone = model_type.backbones.small
  # The yolov5 model requires an img_size parameter
  extra_args['img_size'] = image_size

model_type, backbone, extra_args
(<module 'icevision.models.mmdet.models.retinanet' from '/home/shawley/envs/icevision/lib/python3.8/site-packages/icevision/models/mmdet/models/retinanet/__init__.py'>,
 <icevision.models.mmdet.models.retinanet.backbones.resnet_fpn.MMDetRetinanetBackboneConfig at 0x7fbc563efc10>,
 {})
{% endraw %} {% raw %}
model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), **extra_args) 
/home/shawley/envs/icevision/lib/python3.8/site-packages/mmdet/core/anchor/builder.py:16: UserWarning: ``build_anchor_generator`` would be deprecated soon, please use ``build_prior_generator`` 
  warnings.warn(
Use load_from_local loader
The model and loaded state dict do not match exactly

size mismatch for bbox_head.retina_cls.weight: copying a param with shape torch.Size([720, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([9, 256, 3, 3]).
size mismatch for bbox_head.retina_cls.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([9]).
{% endraw %} {% raw %}
train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)
{% endraw %} {% raw %}
#model_type.show_batch(first(valid_dl), ncols=4)
{% endraw %} {% raw %}
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
{% endraw %} {% raw %}
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
{% endraw %} {% raw %}
learn.lr_find()

# For Sparse-RCNN, use lower `end_lr`
# learn.lr_find(end_lr=0.005)
/home/shawley/envs/icevision/lib/python3.8/site-packages/mmdet/core/anchor/anchor_generator.py:324: UserWarning: ``grid_anchors`` would be deprecated soon. Please use ``grid_priors`` 
  warnings.warn('``grid_anchors`` would be deprecated soon. '
/home/shawley/envs/icevision/lib/python3.8/site-packages/mmdet/core/anchor/anchor_generator.py:360: UserWarning: ``single_level_grid_anchors`` would be deprecated soon. Please use ``single_level_grid_priors`` 
  warnings.warn(
SuggestedLRs(lr_min=8.317637839354575e-05, lr_steep=9.120108734350652e-05)
{% endraw %} {% raw %}
learn.fine_tune(60, 1e-4, freeze_epochs=2)
epoch train_loss valid_loss COCOMetric time
0 0.976641 0.642729 0.323682 00:25
1 0.596252 0.501278 0.451523 00:24
epoch train_loss valid_loss COCOMetric time
0 0.452524 0.399249 0.575420 00:27
1 0.426263 0.365994 0.604415 00:27
2 0.388280 0.351287 0.610145 00:27
3 0.364457 0.342985 0.600091 00:28
4 0.348127 0.326428 0.614855 00:27
5 0.345371 0.306980 0.649200 00:27
6 0.341495 0.294043 0.651103 00:27
7 0.329522 0.309016 0.636205 00:27
8 0.318329 0.276703 0.667442 00:27
9 0.316783 0.330467 0.555162 00:27
10 0.310610 0.274497 0.655791 00:27
11 0.289824 0.259798 0.678061 00:27
12 0.289236 0.251583 0.689259 00:27
13 0.284982 0.267503 0.657997 00:27
14 0.281995 0.250101 0.703570 00:27
15 0.272036 0.235415 0.707341 00:27
16 0.267414 0.234210 0.707327 00:27
17 0.262876 0.222484 0.722220 00:27
18 0.256878 0.229241 0.717355 00:27
19 0.253481 0.242169 0.692606 00:27
20 0.250192 0.220945 0.732879 00:27
21 0.244996 0.213427 0.742637 00:27
22 0.243932 0.225125 0.721749 00:27
23 0.236347 0.211556 0.737580 00:27
24 0.240974 0.208612 0.740199 00:27
25 0.236164 0.213796 0.737580 00:27
26 0.230545 0.218942 0.732357 00:27
27 0.235224 0.207850 0.737813 00:27
28 0.224442 0.212759 0.728622 00:27
29 0.219813 0.209624 0.738186 00:27
30 0.216286 0.208520 0.741145 00:27
31 0.216047 0.200986 0.749800 00:27
32 0.211134 0.208006 0.733276 00:27
33 0.209929 0.209124 0.722883 00:27
34 0.211524 0.203921 0.742878 00:27
35 0.209660 0.213367 0.728206 00:27
36 0.201517 0.200547 0.739839 00:27
37 0.198767 0.196269 0.749959 00:27
38 0.199270 0.198263 0.749550 00:27
39 0.198106 0.197246 0.747433 00:27
40 0.193487 0.202406 0.741240 00:27
41 0.192600 0.197251 0.751214 00:27
42 0.188518 0.198496 0.744129 00:27
43 0.190307 0.201342 0.745648 00:27
44 0.191269 0.197905 0.747522 00:27
45 0.188683 0.198918 0.752821 00:27
46 0.183905 0.198637 0.741208 00:27
47 0.182821 0.196345 0.747649 00:27
48 0.185515 0.195617 0.748123 00:27
49 0.175407 0.194983 0.749370 00:27
50 0.180836 0.199225 0.746571 00:27
51 0.172974 0.198520 0.740187 00:27
52 0.174329 0.196869 0.748899 00:27
53 0.181604 0.196888 0.748835 00:27
54 0.177612 0.197115 0.747597 00:27
55 0.176821 0.197122 0.744587 00:27
56 0.179911 0.197167 0.748072 00:27
57 0.173741 0.196706 0.747585 00:27
58 0.183870 0.196494 0.747673 00:27
59 0.176335 0.196476 0.746890 00:27
{% endraw %} {% raw %}
model_type.show_results(model, valid_ds, detection_threshold=.5)
{% endraw %}
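The COCOMetric used above scores predictions by how well their boxes overlap the ground truth, measured by intersection-over-union. A minimal IoU sketch for two boxes in (xmin, ymin, xmax, ymax) format (my own illustration, not IceVision's implementation):

```python
def iou(a, b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```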

Next steps

  • This was just merged; come help us adjust the documentation and fix the bugs

Conclusion

And that's it! Now that you have your data in the standard library record format, you can use it to create a Dataset, visualize the images with their annotations, and use all the helper functions that IceVision provides!

Happy Learning!

If you need any assistance, feel free to join our forum.